A New Method to Determine Cluster Number Without Clustering for Every K Based on Ratio of Variance to Range in K-Means

Abstract

In many clustering algorithms such as K-means and FCM, the cluster number K needs to be known beforehand. In this paper, we propose a new method to determine K without clustering for every K in K-means. We introduce the statistic RVR (ratio of variance to range) and conduct a Monte Carlo analysis of its characteristics. Based on RVR, we propose an algorithm to determine K and perform K-means clustering utilizing it. We evaluate the effectiveness of the method by performing simulation tests with different types of datasets: first, real datasets, whose clusters and components are known; second, synthetic datasets. We observe significant improvement in the speed and quality of determining K, and therefore of clustering. Finally, we hope the proposed algorithm can be used efficiently and widely for clustering multidimensional data.
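
The abstract names the RVR statistic but does not give its exact definition, so the following is only an illustrative sketch under a stated assumption: RVR is read here, naively, as the sample variance of a one-dimensional sample divided by its range. The function names, the two samplers, and all parameter values are hypothetical and are not taken from the paper. The sketch shows what a Monte Carlo study of such a ratio could look like: it compares the ratio on structureless (uniform) data with the ratio on data that forms two tight groups.

```python
import numpy as np


def rvr(x):
    """Variance-to-range ratio of a 1-D sample.

    ASSUMPTION: this is a naive reading of "ratio of variance to range";
    the paper's actual definition may differ.
    """
    x = np.asarray(x, dtype=float)
    data_range = x.max() - x.min()
    if data_range == 0.0:
        return 0.0  # all values identical: no spread at all
    return x.var(ddof=1) / data_range


def monte_carlo_rvr(sampler, n_points, n_trials, seed=0):
    """Mean and standard deviation of rvr() over repeated random samples."""
    rng = np.random.default_rng(seed)
    values = np.array([rvr(sampler(rng, n_points)) for _ in range(n_trials)])
    return values.mean(), values.std(ddof=1)


def uniform_sampler(rng, n):
    """Hypothetical 'no cluster structure' case: uniform on [0, 1]."""
    return rng.uniform(0.0, 1.0, size=n)


def two_cluster_sampler(rng, n):
    """Hypothetical 'two clusters' case: tight groups near 0 and near 1."""
    centers = rng.choice([0.0, 1.0], size=n)
    return centers + rng.normal(0.0, 0.02, size=n)


if __name__ == "__main__":
    for name, sampler in [("uniform", uniform_sampler),
                          ("two clusters", two_cluster_sampler)]:
        mean, std = monte_carlo_rvr(sampler, n_points=200, n_trials=500)
        print(f"{name}: mean RVR = {mean:.4f}, std = {std:.4f}")
```

Under this assumed definition, the ratio tends to be larger when the data splits into well-separated groups than when it is spread evenly over the same range, which is the kind of behaviour a cluster-number criterion could exploit. How the paper actually turns RVR into a choice of K is not stated in the abstract.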

Similar resources

Persistent K-Means: Stable Data Clustering Algorithm Based on K-Means Algorithm

Identifying clusters, or clustering, is an important aspect of data analysis. It is the task of grouping a set of objects in such a way that objects in the same group/cluster are more similar in some sense or another. It is a main task of exploratory data mining, and a common technique for statistical data analysis. This paper proposed an improved version of K-Means algorithm, namely Persistent K...

Ranking and Clustering Iranian Provinces Based on COVID-19 Spread: K-Means Cluster Analysis

Introduction: The Coronavirus has crossed geographical borders. This study was performed to rank and cluster Iranian provinces based on coronavirus disease (COVID-19) recorded cases from February 19 to March 22, 2020. Materials and Methods: This cross-sectional study was conducted in 31 provinces of Iran using the daily number of confirmed cases. Cumulative Frequency (CF) and Adjusted CF (ACF)...

Hartigan's Method: k-means Clustering without Voronoi

Hartigan’s method for k-means clustering is the following greedy heuristic: select a point, and optimally reassign it. This paper develops two other formulations of the heuristic, one leading to a number of consistency properties, the other showing that the data partition is always quite separated from the induced Voronoi partition. A characterization of the volume of this separation is provide...
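
The greedy heuristic described in this snippet ("select a point, and optimally reassign it") can be sketched as follows. This is a minimal illustration of the standard Hartigan update rule, not code from the paper, and the function names are hypothetical. It relies on a well-known identity: removing a point x from a cluster of n_a points with centroid c_a lowers the sum of squared errors by n_a/(n_a - 1) * ||x - c_a||^2, while adding x to a cluster of n_b points with centroid c_b raises it by n_b/(n_b + 1) * ||x - c_b||^2. A point is moved only when the move lowers the total.

```python
import numpy as np


def hartigan_pass(X, labels, k):
    """One sweep over all points; returns (new_labels, number_of_moves)."""
    X = np.asarray(X, dtype=float)
    labels = np.array(labels, dtype=int)  # work on a copy
    counts = np.bincount(labels, minlength=k).astype(float)
    sums = np.zeros((k, X.shape[1]))
    np.add.at(sums, labels, X)  # per-cluster coordinate sums
    moved = 0
    for i, x in enumerate(X):
        a = labels[i]
        if counts[a] <= 1:
            continue  # never empty a cluster
        centers = sums / np.maximum(counts, 1.0)[:, None]
        d2 = ((centers - x) ** 2).sum(axis=1)
        # SSE removed by taking x out of its current cluster
        gain_leave = counts[a] / (counts[a] - 1.0) * d2[a]
        # SSE added by putting x into each other cluster
        cost_join = counts / (counts + 1.0) * d2
        cost_join[a] = np.inf
        b = int(np.argmin(cost_join))
        if cost_join[b] < gain_leave:  # strict improvement only
            sums[a] -= x
            sums[b] += x
            counts[a] -= 1
            counts[b] += 1
            labels[i] = b
            moved += 1
    return labels, moved


def hartigan(X, labels, k, max_passes=100):
    """Repeat sweeps until no point moves (a local optimum) or the cap is hit."""
    for _ in range(max_passes):
        labels, moved = hartigan_pass(X, labels, k)
        if moved == 0:
            break
    return labels


if __name__ == "__main__":
    X = np.array([[0.0], [0.1], [5.0], [5.1]])
    print(hartigan(X, [0, 0, 0, 1], k=2))
```

Because each accepted move strictly lowers the objective, the loop terminates. Unlike the usual Lloyd iteration, the decision for a point accounts for how the centroids shift when that point changes cluster, which is why the resulting partition need not coincide with the nearest-centroid (Voronoi) partition.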

k*-Means: A new generalized k-means clustering algorithm

This paper presents a generalized version of the conventional k-means clustering algorithm [Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability, 1, University of California Press, Berkeley, 1967, p. 281]. Not only is this new one applicable to ellipse-shaped data clusters without dead-unit problem, but also performs correct clustering without pre-assigning the exact...

A New Soft Computing Method for K-Harmonic Means Clustering

The K-harmonic means clustering algorithm (KHM) is a new clustering method used to group data such that the sum of the harmonic averages of the distances between each entity and all cluster centroids is minimized. Because it is less sensitive to initialization than K-means (KM), many researchers have recently been attracted to studying KHM. In this study, the proposed iSSO-KHM is based on an im...

Journal

Journal title: Mathematical Problems in Engineering

Year: 2022

ISSN: 1026-7077, 1563-5147, 1024-123X

DOI: https://doi.org/10.1155/2022/6866747